Large Experiment and Evaluation Tool for WEKA Classifiers
Abstract
This paper presents a new Windows®-based software utility for WEKA, a data mining software workbench, that simplifies large-scale experimentation and evaluation with many algorithms and datasets in the classification context. The proposed tool, LEET (Large Experiment and Evaluation Tool), makes it possible to accomplish a variety of tasks that are presently difficult or impractical through the standard WEKA interfaces. These include comparing classifiers across multiple experiments, tracking execution time, calculating diversity measures, and summarizing the characteristics of many datasets. We have tested and validated LEET as part of a study involving 50+ machine learning classification/ensemble algorithms, 46 datasets, and the calculation of a variety of performance measures. With WEKA providing the algorithm implementations, LEET facilitates the execution and evaluation of large-scale experiments with greater ease than any existing interface.
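The abstract does not say which diversity measures LEET calculates. As an illustration only, the sketch below computes one common choice, the pairwise disagreement measure: the fraction of instances on which exactly one of two classifiers is correct. The function names, the classifier labels, and the toy predictions are hypothetical and are not taken from LEET or WEKA.

```python
from itertools import combinations


def disagreement(y_true, pred_a, pred_b):
    """Fraction of instances on which exactly one of two classifiers is correct."""
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("all sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one instance")
    # (a == t) != (b == t) is True when one classifier is right and the other wrong.
    differing = sum(
        (a == t) != (b == t) for t, a, b in zip(y_true, pred_a, pred_b)
    )
    return differing / len(y_true)


def mean_pairwise_disagreement(y_true, predictions):
    """Average disagreement over all classifier pairs.

    `predictions` maps a classifier name to its list of predicted labels.
    """
    pairs = list(combinations(sorted(predictions), 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    total = sum(
        disagreement(y_true, predictions[a], predictions[b]) for a, b in pairs
    )
    return total / len(pairs)


# Toy data (hypothetical): true labels and predictions from three classifiers.
y_true = ["yes", "no", "yes", "yes"]
predictions = {
    "J48": ["yes", "no", "no", "yes"],
    "NaiveBayes": ["yes", "yes", "yes", "yes"],
    "IBk": ["no", "no", "yes", "yes"],
}

if __name__ == "__main__":
    print(disagreement(y_true, predictions["J48"], predictions["NaiveBayes"]))
    print(mean_pairwise_disagreement(y_true, predictions))
```

A value of 0 means the two classifiers make their errors on exactly the same instances; larger values indicate that their errors fall on different instances, which is the property ensemble methods try to exploit.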
Similar Resources
MEKA: A Multi-label/Multi-target Extension to WEKA
Multi-label classification has rapidly attracted interest in the machine learning literature, and there are now a large number and considerable variety of methods for this type of learning. We present Meka: an open-source Java framework based on the well-known Weka library. Meka provides interfaces to facilitate practical application, and a wealth of multi-label classifiers, evaluation metrics,...
Finding More Non-supersingular Elliptic Curves for Pairing..
Ensemble learning algorithms such as AdaBoost and Bagging have been actively researched and have shown improvements in classification results on several benchmark data sets, mainly with decision trees as their base classifiers. In this paper we experiment with applying these meta-learning techniques to classifiers such as random forests, neural networks and support vector machines. The data sets are ...
ADABOOST ENSEMBLE ALGORITHMS FOR BREAST CANCER CLASSIFICATION
With advances in technology, different tumor features have been collected for Breast Cancer (BC) diagnosis. Processing such large data sets poses challenges, including high storage requirements and the time required for access and processing. The objective of this paper is to classify BC based on the extracted tumor features. To extract useful information and diagnose the tumo...
Discretizing Continuous Features for Naive Bayes and C4.5 Classifiers
In this work, popular discretization techniques for continuous features in data sets are surveyed, and a new one based on equal width binning and error minimization is introduced. This discretization technique is applied to the Adult database from the UCI Machine Learning Repository [7] and tested on two classifiers from the WEKA tool [6], NaiveBayes and J48. Relative performance changes for t...
Performance Comparison of Naïve Bayes and J48 Classification Algorithms
Classification is an important data mining technique with broad applications across many fields. It assigns each item in a set of data to one of a predefined set of classes or groups. This paper evaluates the performance of the Naïve Bayes and J48 classification algorithms. Naive ...